Nonparametric methods for identifying differentially expressed genes in microarray data
Authors
Abstract
MOTIVATION: Gene expression experiments provide a fast and systematic way to identify disease markers relevant to clinical care. In this study, we address the problem of robust identification of differentially expressed genes from microarray data. Differentially expressed genes, or discriminator genes, are genes with significantly different expression in two user-defined groups of microarray experiments. We compare three model-free approaches: (1) a nonparametric t-test, (2) the Wilcoxon (or Mann-Whitney) rank sum test, and (3) a heuristic method based on high Pearson correlation to a perfectly differentiating gene (the 'ideal discriminator method'). We systematically assess the performance of each method on simulated and biological data under varying noise levels and p-value cutoffs.
RESULTS: All methods exhibit very low false positive rates and identify a large fraction of the differentially expressed genes in simulated data sets with a noise level similar to that of actual data. Overall, the rank sum test appears most conservative, which may be advantageous when the computationally identified genes need to be tested biologically. However, if a more inclusive list of markers is desired, a higher p-value cutoff or the nonparametric t-test may be appropriate. When applied to lung tumor and lymphoma data sets, the methods identify biologically relevant differentially expressed genes that allow clear separation of the groups in question. Thus the methods described and evaluated here provide a convenient and robust way to identify differentially expressed genes for further biological and clinical analysis.
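The three approaches named in the abstract can be illustrated for a single gene with the minimal Python sketch below. It is not the authors' implementation, and the abstract does not give the procedures in detail, so several choices are assumptions: the "nonparametric t-test" is taken to be an unequal-variance t statistic with a permutation-based p-value, the rank sum p-value uses a normal approximation without tie correction, and the ideal discriminator is modeled as a 0/1 template over the two groups. All function names and the example expression values are hypothetical.

```python
import math
import random
from statistics import mean, variance


def t_statistic(a, b):
    """Unequal-variance (Welch) two-sample t statistic."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se


def permutation_t_pvalue(a, b, n_perm=2000, seed=0):
    """Two-sided p-value from random relabelings of the samples (assumed scheme)."""
    rng = random.Random(seed)
    pooled = list(a) + list(b)
    observed = abs(t_statistic(a, b))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        t = abs(t_statistic(pooled[:len(a)], pooled[len(a):]))
        if t >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    return (hits + 1) / (n_perm + 1)


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_pvalue(a, b):
    """Two-sided Wilcoxon rank sum p-value, normal approximation, no tie correction."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    w = sum(ranks[:n1])  # rank sum of the first group
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))


def ideal_discriminator_correlation(a, b):
    """Pearson correlation between a gene and a 0/1 group template (assumed form)."""
    x = list(a) + list(b)
    template = [0.0] * len(a) + [1.0] * len(b)
    mx, mt = mean(x), mean(template)
    cov = sum((xi - mx) * (ti - mt) for xi, ti in zip(x, template))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    st = math.sqrt(sum((ti - mt) ** 2 for ti in template))
    if sx == 0.0 or st == 0.0:
        return 0.0  # a constant gene carries no group information
    return cov / (sx * st)


# Hypothetical expression values for one gene in two groups of arrays.
group_1 = [1.0, 1.2, 0.9, 1.1, 1.05]
group_2 = [3.0, 3.2, 2.9, 3.1, 2.95]
print("permutation t-test p:", permutation_t_pvalue(group_1, group_2))
print("rank sum p:", rank_sum_pvalue(group_1, group_2))
print("ideal discriminator r:", ideal_discriminator_correlation(group_1, group_2))
```

In a full analysis these functions would be applied to every gene and the resulting p-values or correlations compared against a chosen cutoff. With small groups the rank sum p-value cannot fall below a floor set by the group sizes, which is one way to see why the abstract describes that test as the most conservative of the three.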
Related articles
Identification of differentially expressed gene categories in microarray studies using nonparametric multivariate analysis
MOTIVATION The field of microarray data analysis is shifting emphasis from methods for identifying differentially expressed genes to methods for identifying differentially expressed gene categories. The latter approaches utilize a priori information about genes to group genes into categories and enhance the interpretation of experiments aimed at identifying expression differences across treatme...
Extracellular exosomes and preeclampsia: a microarray-based study and functional enrichment analysis
Background: Preeclampsia (PE) is a heterogeneous disease of pregnancy whose exact pathophysiology is unknown. Recently, exosomes have been implicated as a causative factor in the pathogenesis of PE. The aim of the study was to examine microarray library data to extract the differentially expressed genes (DEGs) in PE and to perform a functional enrichment analysis to predict the rol...
Using RNA-seq Data to Detect Differentially Expressed Genes
RNA-sequencing (RNA-seq) technology has become a major choice in detecting differentially expressed genes across different biological conditions. Although microarray technology is used for the same purpose, statistical methods available for identifying differential expression for microarray data are generally not readily applicable to the analysis of RNA-seq data, as RNA-seq data comprise discr...
The False Discovery Rate in Simultaneous Fisher and Adjusted Permutation Hypothesis Testing on Microarray Data
Background and Objectives: In recent years, new technologies have produced large amounts of data, and in biology, microarray technology has also developed dramatically. Meanwhile, the Fisher test is used to compare a control group with two or more experimental groups and to detect differentially expressed genes. In this study, the false discovery rate was investiga...
Identifying genes altered by a drug in temporal microarray data: A case study
DNA microarrays are suited for parallel analysis of gene expression across different tissues or populations of cells. An important application of microarray techniques is identifying genes altered by a particular drug of interest. This process allows biologists to target drug therapies to particular diseases, and, eventually, to gain more knowledge about the biological processes responsible for ...
Journal title:
- Bioinformatics
Volume 18, Issue 11
Pages -
Publication date 2002